METHOD AND SYSTEM FOR UNIFIED SPEECH AND GRAPHIC USER INTERFACES 
Joseph L. Dvorak 

CROSS-REFERENCE TO RELATED APPLICATIONS 

Not applicable 

FIELD OF THE INVENTION 

[0001] This invention relates generally to user interfaces, and more particularly to a 
system and method for efficiently using speech and graphical user interfaces. 

BACKGROUND OF THE INVENTION 

[0002] In a device with both a speech user interface (SUI) and a Graphical User 
Interface (GUI), the user is typically faced with learning separate command sets for each 
of the SUI and GUI. This increases the difficulty of accurately giving commands, 
especially for the SUI. In addition, speech commands are difficult to remember since 
they are transient and there is no persistent display of them to help the user remember. 
Existing systems fail to provide a mechanism to unify the command sets of the GUI and 
SUI so that the user need only learn a single instance of many of the commands to use in 
both interfaces. 

[0003] U.S. Patent 6,075,534 and others describe methods for displaying a 
dynamically changing menu to control and provide feedback on the operation of the 
speech recognizer in a GUI. However, no reference can be found that ensures that the 
GUI menus and dialog box elements are constructed in such a way that by selecting them 
the user essentially builds the corresponding speech command for an application. 

SUMMARY OF THE INVENTION 

[0004] Embodiments in accordance with the present invention provide mechanisms 
for unifying the command sets of a GUI and a SUI so that the user need only learn a 
single instance of many of the commands to use in both interfaces. This makes it easier 
for the user to give the correct commands to operate a device. It also provides a 
mechanism to display the corresponding speech command when the user uses the GUI to 
give the command. This further reinforces the speech command in the user's memory, 
reducing the errors in giving the command by speech. 

[0005] In a first aspect according to one embodiment of the present invention, a 
method for unifying speech user interface and graphic user interface commands includes 
the step of receiving grammar specifying a syntax of at least one speech command and 
having semantic information and the step of processing the grammar to extract the 
semantic information for use with both a graphical user interface and a speech user 
interface. The step of processing can include processing the semantic information to 
generate semantic directives used for parsing the grammar between the graphical user 
interface and the speech user interface. The method can further include the step of 
generating graphical user interface elements corresponding to the semantic information in 
the grammar. The method can further include the step of generating speech grammar 
from the grammar and semantic information. The method can also 
use the grammar and the semantic information to generate visual graphical user interface 
elements required to implement a set of commands contained in the grammar. 

[0006] In a second embodiment, a method for unifying speech user interface and 
graphic user interface commands can include the steps of receiving user entered text via 
a graphical user interface, processing the user entered text via the graphical user interface, 
monitoring the user entered text and adding input context to the user entered text, and, 
updating a speech recognizer with the user entered text and semantic information. The 
step of updating the speech recognizer can include the step of accepting new text 
information and input context to augment and update a speech grammar and recognition 
vocabulary of the speech recognizer. The method can further include the step of updating 
the graphical user interface by updating graphical user interface directives and elements 
to maintain the graphical user interface unified with the speech grammar. The method 
can further include the step of forming a window for displaying a speech interface 
command as it is being built using the graphical user interface. 

[0007] In a third embodiment, a system having a unified speech user interface and 
graphical user interface can include a display for providing an output for graphic user 
interface elements and a processor coupled to the display. The processor can be 
programmed to receive grammar specifying a syntax of at least one speech command and 
having semantic information and to process the grammar to extract the semantic 
information for use with both a graphical user interface and a speech user interface. The 
processor can be further programmed to generate graphical user interface elements 
corresponding to the semantic information in the grammar. Alternatively or optionally, 
the processor can be programmed to generate speech grammar from the grammar and 
semantic information. The processor can also be programmed to use the grammar and 
the semantic information to generate visual graphical user interface elements required to 
implement a set of commands contained in the grammar. Additionally, the processor can 
be programmed to process the semantic information to generate semantic directives used 
for parsing the grammar between the graphical user interface and the speech user 
interface. 

[0008] In a fourth embodiment, a system having a unified speech user interface and 
graphical user interface can include a display for providing an output for graphic user 
interface elements and a processor coupled to the display. The processor can be 
programmed to receive user entered text via a graphical user interface, process the user 
entered text via the graphical user interface, monitor the user entered text and add 
input context to the user entered text, and update a speech recognizer with the user 
entered text and semantic information. The processor can further be programmed to 
update the speech recognizer by accepting new text information and input context to 
augment and update a speech grammar and recognition vocabulary of the speech 
recognizer. Alternatively or optionally, the processor can be programmed to update the 
graphical user interface by updating graphical user interface directives and graphical user 
elements to maintain the graphical user interface unified with the speech grammar of the 
speech user interface. Optionally, the processor can be programmed to form a window in 
the graphical user interface enabling the display of a speech interface command as a user 
constructs the speech interface command. 

[0009] In yet other aspects, embodiments of the present invention can include a 
machine-readable storage having stored thereon a computer program having a plurality of 
code sections executable by a machine for causing the machine to perform the steps 
described above in the method of the first aspect or the second aspect of the invention. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0010] FIG. 1 is a flow chart illustrating how a unified GUI and SUI is constructed in 
accordance with the present invention. 

[0011] FIG. 2 is a flow chart illustrating a method of how a unified GUI and SUI is 
maintained in accordance with the present invention. 

[0012] FIG. 3 is an illustration depicting an example of how a unified SUI would 
appear on a GUI in accordance with the present invention. 

[0013] FIG. 4 is an illustration depicting an example of how a unified SUI would 
appear on a GUI having an incremental speech command display while navigating 
through the GUI in accordance with the present invention. 

[0014] FIG. 5 is a block diagram of a portable communication device having a 
unified SUI and GUI in accordance with the present invention. 

DETAILED DESCRIPTION OF THE DRAWINGS 

[0015] Referring to FIG. 1, a method 10 for unifying the specification of the GUI and 
SUI command sets can include a unified grammar 12 specifying the syntax of the speech 
commands. This grammar can be textual or machine readable and can be annotated with 
semantic information. The method, process or program 10 can process the unified 
grammar 12 and extract the unified semantic information for use by both the SUI and the 
GUI. The extraction process can utilize a unified parser 14 that extracts semantic tags at 
block 16, processes the tags using a unification semantics processor at block 18 and 
generates semantic directives at block 20 for use by the unified parser 14. After 
processing the unified grammar and extracting the unified semantic information, the 
method 10 can implement a function or functions that use the interdependencies among 
the rules of the common or unified grammar and the semantic information contained in 
the grammar annotations to derive a set of menu labels, dialog box labels and choices and 
other GUI elements required to provide the semantically equivalent commands in a 
Graphical User Interface by using blocks 22, 24, and 26. The GUI elements of block 26 
can be visual GUI elements required to implement the commands. At block 28, the 
method or program can also produce the speech grammar, including any speech-only 
semantic annotations. The speech grammar can be compiled at block 30 and provided as 
commands in grammar binary at block 32. 
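
As a minimal sketch of the flow of FIG. 1, the following Python fragment derives both GUI elements and speech phrases from one annotated grammar. The grammar format, the example rules, and the function names (`UNIFIED_GRAMMAR`, `derive_gui_elements`, `derive_speech_phrases`) are assumptions made for illustration; the description above does not prescribe a particular grammar notation or implementation.

```python
from itertools import product

# Hypothetical unified grammar: each rule is an ordered list of slots.
# Each slot pairs a semantic tag (reused as a GUI label) with its spoken choices.
UNIFIED_GRAMMAR = {
    "call": [("action", ["call"]), ("contact", ["home", "office"])],
    "play": [("action", ["play"]), ("media", ["music", "voicemail"])],
}


def derive_gui_elements(grammar):
    """Map each rule to a menu entry whose dialog choices mirror the grammar slots."""
    menus = {}
    for rule, slots in grammar.items():
        # The first slot names the menu entry; later slots become dialog labels.
        menus[rule] = {tag: list(choices) for tag, choices in slots[1:]}
    return menus


def derive_speech_phrases(grammar):
    """Expand each rule into the utterances a recognizer would need to accept."""
    phrases = []
    for slots in grammar.values():
        for combo in product(*(choices for _, choices in slots)):
            phrases.append(" ".join(combo))
    return phrases
```

Because both outputs come from the same rules, choosing the menu entry "call" and then the dialog choice "home" corresponds word for word to the spoken command "call home".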

[0016] Referring to FIG. 2, a method or program 50 (that may be part of any of the 
other programs above) monitors a user's entries 52 in elements of the GUI that accept text 
(whether from a device's keypad/keyboard, a handwriting recognizer, etc.) and then 
provides the newly entered information to a speech recognizer 51 along with the current 
entry context. The speech recognizer 51 accepts the new information and input context to 
augment and update its speech grammar and its recognition vocabulary. In one particular 
embodiment as shown in FIG. 2, the user entered GUI text 52 is processed through a GUI 
element routine 54 to provide text 56 for the unification parser 60 and text for updating a 
GUI element database or file 70 (via the unification parser 60 and block 68) in order to 
provide an updated GUI 72 as will be further explained below. The text 56 used by the 
unification parser 60 along with the input context (62) processed by the unification 
semantics processor 64 provides semantic directives 66 enabling the updating of the 
speech grammar at block 74. The speech grammar can then be compiled at block 76 to 
provide updated grammar binary at block 78. Similarly, the GUI elements can be updated 
to keep them unified with the speech grammar by receiving semantic directives (66) via 
the unification parser 60 to provide updated GUI directives at block 68, an updated GUI 
element at block 70 and an updated GUI at block 72. In other words, blocks 60-78 
represent a program that updates the GUI elements to keep them unified with the speech 
grammar. Also note that the speech recognizer 51 can optionally be configured to include 
blocks 74, 76 and 78, although other arrangements for the speech recognizer 51 are 
certainly contemplated within the scope of the present invention. 
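
The update path of FIG. 2 can be sketched as follows, assuming a simple in-memory recognizer stand-in. The class names (`SpeechRecognizerStub`, `UnifiedUpdater`), the use of a string as the input context, and the lower-casing of entered text are all hypothetical choices for this illustration, not details taken from the description.

```python
from dataclasses import dataclass, field


@dataclass
class SpeechRecognizerStub:
    """Stand-in for a recognizer: maps an input context to the words it accepts."""
    vocabulary: dict = field(default_factory=dict)

    def update(self, text, context):
        # Add the new text to the vocabulary for this context, once.
        words = self.vocabulary.setdefault(context, [])
        if text not in words:
            words.append(text)


@dataclass
class UnifiedUpdater:
    """Keeps the GUI choices and the recognizer vocabulary in step."""
    recognizer: SpeechRecognizerStub
    gui_choices: dict = field(default_factory=dict)

    def on_text_entered(self, text, context):
        """Handle text the user committed in the GUI field identified by context."""
        text = text.strip().lower()  # normalization is an assumption of this sketch
        if not text:
            return
        self.recognizer.update(text, context)
        choices = self.gui_choices.setdefault(context, [])
        if text not in choices:
            choices.append(text)
```

For example, entering "Alice" in a contact field would make "alice" available both as a dialog choice and as a recognizable word in the "contact" context.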

[0017] Referring to FIG. 3, a SUI and GUI unification example 300 is illustrated. The 
speech user interface commands essentially match the graphical user interface 
elements illustrated. As shown in the example 400 of FIG. 4, the GUI can also include a 
window 402 where the speech interface command is displayed as it is being constructed 
by the user using the GUI, further enhancing the user's memory of the speech commands. 
As previously explained, the system of the present invention reduces the number of 
commands that the user has to remember for the device interfaces since the speech 
interface and graphical interface will share a number of commands in common. This 
makes the device much easier to use by speech since every time the user uses the GUI, the 
user is reminded of the command for the speech interface. 
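
The incremental display of window 402 can be illustrated with a short sketch. The class `CommandPreview` and its methods are hypothetical names introduced here; the sketch assumes that each GUI selection contributes one word or phrase to the spoken command.

```python
class CommandPreview:
    """Accumulates GUI selections and exposes the equivalent spoken command."""

    def __init__(self):
        self._parts = []

    def select(self, label):
        """Record a menu or dialog selection and return the command built so far."""
        self._parts.append(label)
        return self.text()

    def back(self):
        """Undo the most recent selection, if any, and return the remaining command."""
        if self._parts:
            self._parts.pop()
        return self.text()

    def text(self):
        """Return the text that the preview window would display."""
        return " ".join(self._parts)
```

A GUI would call `select` on each menu or dialog choice and write the returned string into the preview window, so the user sees the speech command grow as the GUI is navigated.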

[0018] A unified SUI and GUI can be used on or in conjunction with any number of 
devices or systems to provide a user with the advantages described above. For example, a 
system 500 as shown in FIG. 5 can be or include a laptop computer, a desktop computer, 
a personal digital assistant, a mobile telephone, an electronic book, a smart phone, a 
communication controller, or a portable handheld computing/communication device. A 
communication controller can be a device that does not (by itself) directly provide a 
human recognizable output and does not necessarily include a display, speaker, or other 
output device. 

[0019] The system 500 of FIG. 5 in particular illustrates a block diagram of a portable 
communication device such as a mobile telephone having a processor 512 programmed to 
function in accordance with the described embodiments of the present invention. The 
portable communication device can include an encoder 528, transmitter 526 and antenna 
524 for encoding and transmitting information as well as an antenna 530, receiver 532 
and decoder 534 for receiving and decoding information sent to the portable 
communication device. The device or system 500 can further include a memory 520, a 
display 522 for at least displaying a graphical user interface, and a speaker 521 for 
providing an audio output. The processor or controller 512 can be coupled to the display 
522, the speaker 521, the encoder 528, the decoder 534, and the memory 520. The 
memory 520 can include address memory, message memory, and memory for database 
information. Additionally, the system 500 can include one or more user input/output 
devices 518 coupled to the processor 512. The input device 518 can be a microphone for 
receiving voice instructions that can be transcribed to text using voice-to-text logic, for 
example. Of course, the input device 518 can also be a keyboard, a handwriting recognition 
tablet, or some other Graphical User Interface for entering text. 

[0020] As previously described, the processor can be programmed to receive 
grammar specifying a syntax of at least one speech command and having semantic 
information and to process the grammar to extract the semantic information for use with 
both a graphical user interface and a speech user interface. The processor can be 
programmed to receive user entered text via a graphical user interface, process the user 
entered text via the graphical user interface, monitor the user entered text and add 
input context to the user entered text, and update a speech recognizer with the user 
entered text and semantic information. The processor can further be programmed to 
update the speech recognizer by accepting new text information and input context to 
augment and update a speech grammar and recognition vocabulary of the speech 
recognizer. Alternatively or optionally, the processor can be programmed to update the 
graphical user interface by updating graphical user interface directives and graphical user 
elements to maintain the graphical user interface unified with the speech grammar of the 
speech user interface. Optionally, the processor can be programmed to form a window in 
the graphical user interface enabling the display of a speech interface command as a user 
constructs the speech interface command. 

[0021] In light of the foregoing description of the invention, it should be recognized 
that the present invention can be realized in hardware, software, or a combination of 
hardware and software. A method and system for unifying a speech user interface and 
graphical user interface according to the present invention can be realized in a centralized 
fashion in one computer system or processor, or in a distributed fashion where different 
elements are spread across several interconnected computer systems or processors (such 
as a microprocessor and a DSP). Any kind of computer system, or other apparatus 
adapted for carrying out the methods described herein, is suited. A typical combination 
of hardware and software could be a general purpose computer system with a computer 
program that, when being loaded and executed, controls the computer system such that it 
carries out the methods described herein. 

[0022] The present invention can also be embedded in a computer program product, 
which comprises all the features enabling the implementation of the methods described 
herein, and which, when loaded in a computer system, is able to carry out these methods. 
A computer program or application in the present context means any expression, in any 
language, code or notation, of a set of instructions intended to cause a system having an 
information processing capability to perform a particular function either directly or after 
either or both of the following: a) conversion to another language, code or notation; b) 
reproduction in a different material form. 

[0023] Additionally, the description above is intended by way of example only and is 
not intended to limit the present invention in any way, except as set forth in the following 
claims. 